Mean-field analysis is an important tool for understanding dynamics on complex networks. However, surprisingly little attention has been paid to the question of whether mean-field predictions are accurate, and this is particularly true for real-world networks with clustering and modular structure. In this paper, we compare mean-field predictions to numerical simulation results for dynamical processes running on 21 real-world networks and demonstrate that the accuracy of the theory depends not only on the mean degree of the networks but also on the mean first-neighbor degree. We show that mean-field theory can give (unexpectedly) accurate results for certain dynamics on disassortative real-world networks even when the mean degree is as low as 4.

PACS numbers: 89.75.Hc, 64.60.aq, 89.75.Fb, 87.23.Ge



I. INTRODUCTION 

Mean-field theories are the most common form of analytical approximation employed when studying dynamics on complex networks. Typically, mean-field theories are derived under several assumptions:

(i) Absence of local clustering. When considering possible changes to the state of a node $B_1$, it is assumed that the states of the neighbors of node $B_1$ are independent of each other. However, this assumption that the network is "locally tree-like" does not hold if the neighbors of $B_1$ are also linked to each other, i.e., if the network is clustered (exhibits transitivity).

(ii) Absence of modularity. It is also usually assumed that all nodes of the same degree k are well described by the mean k-class state, i.e., by the average over all nodes of degree k. However, this might not be true if the network is modular, so that the states of degree-k nodes are differently distributed in different communities.

(iii) Absence of dynamical correlations. Finally, it is assumed that the states of each node $B_1$ and those of its neighbors can be treated as independent when updating the state of node $B_1$.

Importantly, the neglect of dynamical correlations (as distinct from structural correlations such as degree-degree correlations) between neighbors in assumption (iii) can be addressed in improved theories that incorporate information on the joint distribution of node states at the ends of a random edge in the network (cf. theories that only specify the structures at the ends of a random edge). The improved theories are often called pair approximations (PA), and these are inevitably more complicated to derive and study than mean-field (MF) theories, so we mostly restrict our attention in the present paper to the more common MF-theory situation.

The distinctions between assumptions (i)-(iii) can be clarified by considering the theoretical approaches beyond the MF level that have been developed in certain cases to deal with violations of (i), (ii), and (iii). The impact of non-zero clustering on percolation problems on a network has been examined in Refs. [8-12]. The analytical methods used in those papers explicitly account for the dependence of neighbors' states on each other, i.e., for the violation of MF assumption (i). The role of community or modular structures [see assumption (ii)] on percolation requires a different extension of analytical methods [11, 13]. As noted above, PA theories can account for dynamical correlations more accurately than MF theory, thereby improving on assumption (iii).

The MF assumptions enumerated above are clearly violated for real-world networks, which are often highly clustered and modular. It is therefore rather surprising that MF theory often provides a reasonably good approximation to the actual dynamics on many real-world networks. This fact has been noted by several authors, but to our knowledge no comprehensive explanation for this phenomenon has ever been developed. In studying this phenomenon, we focus on a specific question of obvious practical interest: Given a real-world network and a dynamical process running on it, is it possible to predict whether or not MF theory will provide a good approximation to the actual dynamics on this network? In this paper, we test multiple well-studied dynamical processes on 21 undirected, unweighted real-world networks. We enumerate and summarize various properties of the networks in Table I. These networks are characterized by a range of values for several standard network diagnostics, which is important for this study. We show that MF theory typically works well provided d, the mean degree of first neighbors of a random node, is sufficiently large. In contrast, we demonstrate that the mean degree z of the network is not necessarily a good indicator of MF accuracy.

The remainder of this paper is organized as follows. In Section II, we introduce the dynamical processes that we consider and compare numerical results with MF theory for sample real-world networks. In Section III, we discuss the implications of our results and propose an explanatory hypothesis. In Section IV, we compare our results with earlier work in this area. We conclude in Section V.



II. EXAMPLES

We begin by showing examples for which MF theory gives accurate results for dynamics on real-world networks, contrasting with examples in which MF theory performs poorly.



A. Kuramoto Phase Oscillator Model

In Fig. 1, we show the results of running the Kuramoto phase oscillator model on the Facebook Oklahoma network and on the US Power Grid network [21]. Each node corresponds to an oscillator with an intrinsic frequency drawn from a unit-variance Gaussian distribution. The phase $\theta_j(t)$ of the oscillator at node j obeys the differential equation

$$\frac{d\theta_j}{dt} = \omega_j + K \sum_{l=1}^{N} A_{jl} \sin(\theta_l - \theta_j), \tag{1}$$

where $\omega_j$ is the intrinsic frequency of node j, N is the number of nodes, and A is the adjacency matrix of the network. The coupling to network neighbors is measured by the parameter K, and global synchrony of the oscillators is expected to emerge for sufficiently large K [24]. Synchrony is quantified using the order parameter $r_2$ of Ref. [22] [Eq. (2)], where $k_j$ is the degree of node j and $i = \sqrt{-1}$. The MF theory of Ref. [22] is given by Eq. (3), where $I_n$ is the modified Bessel function of the first kind, $p_k$ is the degree distribution of the network, and $z = \langle k \rangle = \sum_k k p_k$ is the mean degree. The agreement in Fig. 1 between theory and simulation is very good for the Facebook network but very poor for the Power Grid. (See Fig. 4 for additional examples.)

FIG. 1: (Color online) Order parameter for synchronization in the Kuramoto phase oscillator model running on (a) the Facebook Oklahoma network and (b) the US power grid network [21] as a function of the coupling K. The order parameter $r_2$ is defined in Eq. (2). The MF theory of [22] is given by Eq. (3).

The results of Fig. 1 are perhaps explained in part by noting that the mean degree z of the Facebook network is 102, whereas z ≈ 2.67 for the Power Grid (see Table I). It is arguable that nodes with many neighbors will experience something closer to a "mean field" than nodes with few neighbors. In particular, it is plausible that low-z networks might be more prone to errors in MF due to neglecting the effects of clustering, modularity, and dynamical correlations. This explanation is attractively simple, but as we show below, it does not fully capture certain subtleties of this question.
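As an illustration of the kind of numerical experiment described in this subsection, the following is a minimal sketch of a Kuramoto simulation on a network. It assumes the coupled form $d\theta_j/dt = \omega_j + K \sum_l A_{jl} \sin(\theta_l - \theta_j)$ and, as a further assumption, a degree-weighted order parameter $|\sum_j k_j e^{i\theta_j}| / \sum_j k_j$; the function names, the forward-Euler scheme, and the step size are illustrative choices and are not taken from the paper.

```python
import numpy as np

def kuramoto_rhs(theta, omega, A, K):
    """Assumed network Kuramoto right-hand side:
    d(theta_j)/dt = omega_j + K * sum_l A[j, l] * sin(theta_l - theta_j)."""
    diff = theta[None, :] - theta[:, None]  # diff[j, l] = theta_l - theta_j
    return omega + K * np.sum(A * np.sin(diff), axis=1)

def simulate_kuramoto(A, omega, K, t_max=20.0, dt=0.01, seed=0):
    """Forward-Euler integration from uniformly random initial phases
    (an illustrative scheme, not necessarily the one used by the authors)."""
    rng = np.random.default_rng(seed)
    theta = rng.uniform(0.0, 2.0 * np.pi, size=len(omega))
    for _ in range(int(round(t_max / dt))):
        theta = theta + dt * kuramoto_rhs(theta, omega, A, K)
    return theta

def degree_weighted_order_parameter(theta, A):
    """Assumed order parameter: |sum_j k_j exp(i theta_j)| / sum_j k_j."""
    k = A.sum(axis=1)
    return np.abs(np.sum(k * np.exp(1j * theta))) / np.sum(k)
```

Sweeping K and recording the order parameter at late times gives a numerical curve of the type compared with MF theory in Fig. 1; in practice one would also average over frequency draws and discard transients.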



B. SIS Epidemic Model 

In Fig. 2, we compare simulations for the susceptible-infected-susceptible (SIS) epidemic model [26-28] with the corresponding predictions of a well-known MF theory. In the MF theory, the fraction $i_k(t)$ of degree-k nodes that are infected at time t is given by the solution of the equation

$$\frac{di_k}{dt} = -i_k + \beta (1 - i_k) k \Theta_k, \tag{4}$$
FIG. 2: (Color online) Fraction of infected nodes in the steady state of the SIS process on the (a) AS Internet [29] and (b) Electronic Circuit (s838) [30] networks as a function of the spreading rate β. The MF theory is that of Eqs. (4) and (5). Observe that uncorrelated and correlated MF theories are indistinguishable in panel (b).



where β is the spreading rate, the recovery rate has been set to unity by choice of timescale, and

$$\Theta_k = \sum_{k'} P(k'|k)\, i_{k'} \tag{5}$$

is the probability that any given neighbor of a degree-k node is infected. In Eq. (5), P(k'|k) is the probability that an edge originating at a degree-k node has a degree-k' node at its other end. Because degree-degree correlations are included, this version of the theory is called a correlated MF theory (cMF). A further simplification of the theory is possible if one assumes that the network is uncorrelated and is thus completely described by its degree distribution $p_k$. In this case, which is termed uncorrelated MF (uMF), P(k'|k) in Eq. (5) is replaced by $k' p_{k'}/z$ and $\Theta_k$ becomes independent of k. In Fig. 2, we show predictions of both correlated and uncorrelated MF theories for the steady-state endemic infected fraction

$$\lim_{t \to \infty} \sum_k p_k i_k(t)$$
for the AS Internet network [29] and for the Electronic Circuit (s838) network [30]. The MF theory is very accurate for the former, but it performs poorly for the latter. The result for the AS Internet network is particularly surprising in light of the fact that the mean degree of this network is only 4. The aforementioned naive argument that MF theory is accurate for high-degree nodes thus cannot account for the good performance of the theory in this low-z case, where 96% of the nodes in the network have degree 10 or less (see Table I).
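The MF equations of this subsection are straightforward to integrate numerically. The sketch below assumes the form $di_k/dt = -i_k + \beta(1 - i_k) k \Theta_k$ with $\Theta_k = \sum_{k'} P(k'|k) i_{k'}$, and falls back to the uncorrelated choice $P(k'|k) = k' p_{k'}/z$ when no conditional matrix is supplied. The function name, the forward-Euler scheme, and the default parameters are illustrative assumptions.

```python
import numpy as np

def sis_mf_steady_state(degrees, p_k, beta, P_cond=None,
                        t_max=200.0, dt=0.01, i0=0.05):
    """Integrate di_k/dt = -i_k + beta*(1 - i_k)*k*Theta_k by forward Euler.

    degrees : distinct degree values k
    p_k     : degree distribution over those values
    P_cond  : P_cond[a, b] = P(k'_b | k_a); if None, the uncorrelated
              form k' p_k' / z is used (uMF)
    Returns (i_k at t_max, overall infected fraction sum_k p_k i_k).
    """
    k = np.asarray(degrees, dtype=float)
    p = np.asarray(p_k, dtype=float)
    z = np.sum(k * p)
    if P_cond is None:
        # Every row is the same: neighbor degrees are independent of k.
        P_cond = np.tile(k * p / z, (len(k), 1))
    i_k = np.full(len(k), i0)
    for _ in range(int(round(t_max / dt))):
        theta = P_cond @ i_k  # Theta_k for each degree class
        i_k = i_k + dt * (-i_k + beta * (1.0 - i_k) * k * theta)
    return i_k, float(np.sum(p * i_k))
```

For a regular network in which every node has degree k, the steady state reduces to $1 - 1/(\beta k)$ above the threshold $\beta k = 1$, which provides a convenient sanity check.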



C. Voter Model 

As a third example, we consider the survival probability of disordered-state trajectories in the voter model
FIG. 3: (Color online) Survival probability for voter-model trajectories in the disordered state as a function of time on (a) the C. elegans neural network [21] and (b) a synthetic clustered network generated as described briefly in the main text and in detail in Ref. [13] with $\gamma(3,3) = 1$. Both networks contain approximately N = 300 nodes. The theory curves are the MF and PA predictions discussed in the main text.



[34] and compare it with the (uncorrelated) MF theory of Ref. [33]. (For rigorous results for the voter model, see Refs. [2, 35] and references therein.) At time t = 0, each node is randomly assigned to one of two voter states. In each time step (of size dt = 1/N), a randomly chosen node is updated by copying the state of one of its randomly chosen neighbors. On finite networks, the dynamics eventually drive a connected component to complete order, in which all the nodes are in the same state. The survival probability $P_s(t)$ is defined as the fraction of realizations that remain in the disordered state at time t. The red dashed curve in Fig. 3 gives the survival probability predicted by the MF theory of Ref. [33] [Eq. (6)], and the black solid curve gives the result of the pair-approximation (PA) theory [Eq. (7)].
In Fig. 3(a), we show results for the C. elegans neural network [21] (z ≈ 14.46), for which MF theory is very accurate. In Fig. 3(b), we show results for a synthetic clustered network described in Ref. [13]: Every node in the network has degree 3 and is part of a single 3-clique. Using the notation of Ref. [13], this is called a $\gamma(3,3) = 1$ network. It can alternatively be described as having $p_{1,1} = 1$, as every node is part of one triangle and has one single edge other than those belonging to the triangle. Clustering has a strong effect in this (low-z) network; because of this clustering and dynamical correlations, MF theory is very inaccurate. We can make the clustering negligibly small while keeping the degree distribution unchanged (at $p_k = \delta_{k,3}$) by randomly rewiring the network to give a random 3-regular graph. However, even after this rewiring, the match to MF theory is poor because dynamical correlations are still neglected. The recent PA theory for the voter model accounts for the dynamical correlations and hence gives a good match to the survival probability on the rewired network but not on the clustered original network.
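The update rule and the survival probability described in this subsection can be sketched as follows. The code is a minimal illustration that assumes an undirected connected network given as a list of neighbor lists; the function names, the cap on the number of sweeps, and the use of one seed per realization are assumptions made for the example.

```python
import random

def voter_consensus_time(neighbors, seed=0, max_sweeps=10_000):
    """Simulate the voter model until all nodes share one state.

    neighbors : list of neighbor lists (connected, no isolated nodes)
    Returns (consensus time with steps of size 1/N, final states), or
    (None, states) if consensus is not reached within max_sweeps.
    """
    rng = random.Random(seed)
    n = len(neighbors)
    states = [rng.randrange(2) for _ in range(n)]  # random initial opinions
    ones = sum(states)
    steps = 0
    while 0 < ones < n:
        if steps >= max_sweeps * n:
            return None, states
        node = rng.randrange(n)
        new_state = states[rng.choice(neighbors[node])]  # copy a random neighbor
        ones += new_state - states[node]
        states[node] = new_state
        steps += 1
    return steps / n, states

def survival_probability(neighbors, t, realizations=200):
    """Fraction of realizations that are still disordered at time t."""
    alive = 0
    for seed in range(realizations):
        t_c, _ = voter_consensus_time(neighbors, seed=seed)
        if t_c is None or t_c > t:
            alive += 1
    return alive / realizations
```

Evaluating `survival_probability` on a grid of times gives a numerical curve of the kind plotted as symbols in Fig. 3.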



III. WHY IS MEAN-FIELD THEORY 
ACCURATE? 

Briefly summarizing our observations thus far, we have seen (i) situations in which high-z networks exhibit good matches to MF theory, but also (ii) some examples in which low-z networks also have accurate MF theories. For clarity, we have discussed only a few examples in detail, but these are representative of behavior observed for different dynamical processes on a variety of real-world networks. In Fig. 4, we show additional examples for each of the three dynamical processes (the Kuramoto, SIS, and voter models) for each of the 6 networks used in Figs. 1-3.



A. Mean-Field Accuracy for the SIS Model 

Clearly, the success of MF theories for dynamical systems on networks cannot be explained purely in terms of the mean degree z of the underlying network. Figure 2(a) in particular gives an example in which MF theory works well on a low-z network. To understand this seemingly surprising accuracy, we focus on the SIS model and consider how the state of a node is updated as compared to the assumptions of MF theory. Suppose the state of the degree-k node $B_1$ is being updated. In both the true dynamics and in MF theory, the updating process depends on the state of the neighbors of $B_1$. We take node $B_2$ as a representative neighbor of $B_1$ and suppose that $B_2$ has degree k'. Under MF assumption (iii), the state of node $B_2$ is considered to be independent of the state of node $B_1$. This is why Eq. (5) involves the term $i_{k'}$, which is the probability that degree-k' nodes are in the infected state, without any conditioning on the state of their neighbor $B_1$.

In reality, however, the states of nodes $B_1$ and $B_2$ exhibit dynamical correlations. For example, during an epidemic, an infected node is more likely to have infected neighbors than a susceptible node. Such dynamical correlations can be included explicitly in pair-approximation theories [53-55], and their neglect can be a major source of error in MF theories. This suggests an important question: Under what circumstances might the MF assumption of dynamical independence (iii) still give accurate results for the update of node $B_1$? One can argue that if the degree k' of node $B_2$ is sufficiently large, then the state of node $B_2$ is influenced by many of its neighbors other than node $B_1$, so the error in neglecting the particular dynamical correlation between $B_2$ and $B_1$ is relatively small for the purpose of updating node $B_1$. Conversely, if the degree k' of node $B_2$ is small, then node $B_1$ has a relatively strong influence on the state of node $B_2$, and neglecting dynamical correlations between $B_2$ and $B_1$ when updating $B_1$ will yield large errors. Hence, we expect MF theory to give reasonably accurate updates for node $B_1$ if its neighbors have sufficiently high degrees. Importantly, this argument relies only on the degree k' of the nearest-neighbor nodes being high and gives no restriction on the degree k of the updating node itself. (We remark that the use of networks with nearest-neighbor nodes of high degree has also been mentioned in studies of the Kuramoto model on networks.)

In short, we argue that MF theory gives relatively small error for nodes with high-degree neighbors. More specifically, MF theory is likely to be inaccurate if many nodes do not have high-degree neighbors. This can happen, for example, if low-degree nodes are connected preferentially to other low-degree nodes (a sort of "poor-club phenomenon", akin to the rich-club phenomenon of high-degree nodes connecting preferentially to other high-degree nodes). This suggests a simple but effective predictor of MF accuracy: If the mean degree of first neighbors $d = \sum_k p_k \sum_{k'} k' P(k'|k)$ is high, then MF theory can be expected to be accurate. As we show below, this rule of thumb works well for SIS dynamics on all of the networks that we have considered.
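Because d is the proposed diagnostic, it may help to see how it can be computed from an edge list. The sketch below uses the fact that $\sum_k p_k \sum_{k'} k' P(k'|k)$ equals the average, over nodes, of the mean degree of each node's neighbors. The function name is hypothetical, and the choice to ignore self-loops, duplicate edges, and isolated nodes is an assumption of the example.

```python
from collections import defaultdict

def mean_degree_and_neighbor_degree(edges):
    """Return (z, d) for an undirected, unweighted edge list.

    z : mean degree
    d : mean over nodes of the average degree of that node's neighbors
    Self-loops and repeated edges are dropped; nodes with no edges are
    not counted (assumptions of this sketch).
    """
    adj = defaultdict(set)
    for u, v in edges:
        if u != v:
            adj[u].add(v)
            adj[v].add(u)
    degree = {u: len(nb) for u, nb in adj.items()}
    n = len(adj)
    z = sum(degree.values()) / n
    d = sum(sum(degree[v] for v in nb) / degree[u] for u, nb in adj.items()) / n
    return z, d
```

A star with one hub and four leaves has z = 1.6 but d = 3.4, since every leaf sees the high-degree hub; this is the disassortative situation in which d greatly exceeds z.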

In Fig. 5, we present results from numerical simulations of SIS dynamics and consider the final, steady-state fraction I of infected nodes (for each network we average over an ensemble of more than several hundred realizations, in each of which 5% of the nodes are randomly chosen to be infected at t = 0). Because the quality of the MF approximation is known to depend on the number of infected nodes [50], we further compare errors for different networks by choosing the spreading rate β in each simulation so that the MF steady-state value $I_{\text{theory}}$ equals 0.5. Letting

$$E_S = \frac{I_{\text{theory}} - I_{\text{numerical}}}{I_{\text{theory}}} \tag{8}$$

be the relative error between the theoretical predictions of correlated MF theory (4)-(5) and the numerical results, we color each network's symbol in Fig. 5 by the value of $E_S$ (see Table I for the error values). The results in Fig. 5 show clearly that situations with low d and low z correspond to inaccurate cMF predictions [see Figs. 1(b), 2(b), and 3(b)] and that the high-d situations (some of which also have small z) all have accurate cMF predictions, supporting our claim that the fidelity of MF theory can be evaluated using d.
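The calibration step described above (choosing β so that the MF steady state equals 0.5) can be sketched for the simpler uncorrelated MF theory. Setting the time derivative of the MF equation to zero gives $i_k = \beta k \Theta / (1 + \beta k \Theta)$ with $\Theta = \sum_k k p_k i_k / z$, which the code solves by fixed-point iteration inside a bisection on β. The function names, the iteration count, and the bracketing interval are assumptions of the example; the paper itself uses the correlated theory for this comparison.

```python
import numpy as np

def umf_infected_fraction(k, p, beta, iters=2000):
    """Steady-state infected fraction of uncorrelated MF theory, from
    i_k = beta*k*Theta / (1 + beta*k*Theta), Theta = sum_k k p_k i_k / z."""
    k = np.asarray(k, dtype=float)
    p = np.asarray(p, dtype=float)
    z = np.sum(k * p)
    theta = 0.5  # start away from the trivial solution Theta = 0
    for _ in range(iters):
        i_k = beta * k * theta / (1.0 + beta * k * theta)
        theta = np.sum(k * p * i_k) / z
    return float(np.sum(p * i_k))

def beta_for_target(k, p, target=0.5, lo=0.0, hi=10.0, tol=1e-10):
    """Bisection for the spreading rate giving the target MF steady state."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if umf_infected_fraction(k, p, mid) < target:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def relative_error(theory, numerical):
    """Relative error of the form (theory - numerical) / theory."""
    return (theory - numerical) / theory
```

With β calibrated in this way, `relative_error(0.5, I_numerical)` gives an error measure of the same form as $E_S$ once the simulated steady-state fraction is available.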

In Section IV, we describe an alternative measure for predicting MF accuracy that uses inter-vertex distances [52]. One can also construct other, more complicated measures (e.g., by computing the size of the connected
FIG. 4: (Color online) Results for dynamics of Figs. 1-3 for all 6 networks used in those figures. Curves and symbols are as in Figs. 1-3. For the voter model, black squares show the numerical results obtained by rewiring networks in a manner that conserves both degree distribution and degree-degree correlations.
FIG. 5: (Color online) Values of mean degree z and mean nearest-neighbor degree d for many real-world networks. The numbers that label the networks are enumerated in column 1 of Table I. The colors indicate the magnitude of the relative error $E_S$, defined in Eq. (8), between cMF theory and numerical simulations for SIS dynamics with $I_{\text{theory}} = 0.5$.



cluster of low-degree nodes); however, the mean first-neighbor degree d is appealing because it is simple to calculate and understand, and it retains considerable explanatory power. We note that similarly accurate results have been found for real-world networks using MF theory for a discrete-time version of the SIS model [56, 57].



B. Isolating the effect of dynamical correlations 
using synthetic networks 

We have argued above that the observed accuracy of 
MF theory on some real-world networks is due to their 
high d values ameliorating the neglect of dynamical cor- 
relations [i.e., MF assumption (iii)]. However, real-world 
networks typically have high values for clustering coeffi- 
cients and significant modular structures, so MF theory 
for such networks violates assumptions (i) and (ii) as well 
as assumption (iii). It might therefore be argued that the 
high-d effect seen in Fig. [5] could be due to an improve- 
ment in the validity of assumptions (i) and/or (ii), which 
could in principle have a larger impact than the high-d 
improvement of assumption (iii). We investigate this by 
now considering SIS dynamics on synthetic random networks (with $N = 10^4$ nodes) in which the transitivity
and community structure are both negligible. Thus, MF 
assumptions (i) and (ii) are both valid for these networks, 
and the error in MF theory can be due only to violations 
of assumption (iii). 

FIG. 6: Error (diamonds) of cMF theory in predicting steady-state infected fraction for SIS dynamics as a function of Pearson correlation coefficient r in the synthetic networks (described in the text) with $(k_1, k_2) = (3, 12)$. Also shown are the errors for the class of degree-3 nodes (crosses) and for the class of degree-12 nodes (triangles): these are calculated similarly to Eq. (8), using $i_k(\infty)$ [from Eq. (4)] for $I_{\text{theory}}$ and the average infected fraction of degree-k nodes in simulations for $I_{\text{numerical}}$. The solid line gives the mean nearest-neighbor degree d.

FIG. 7: (Color online) Mean (black squares) and standard deviation (error bars show one standard deviation above and below the mean) of the distribution of $f_i$ values (described in text) on the same networks as in Fig. 6. The values of $I_{\text{theory}}$ are given by cMF theory, using $i_3(\infty)$ from Eq. (4).

FIG. 8: Relative errors $E_{SI}$ (symbols), given by Eq. (10), in the calculation of the fraction of SI edges using MF assumption (iii) on the same networks as in Fig. 6. The solid line gives the mean nearest-neighbor degree d.

The first family of synthetic networks we use is described in Ref. [58]: Each node is either of low degree $k_1$ [with probability $p_{k_1} = k_2/(k_1 + k_2)$] or of high degree $k_2$ [with probability $p_{k_2} = k_1/(k_1 + k_2)$]. In order to create a network with a prescribed degree-degree correlation coefficient r, we connect the nodes of each type preferentially to nodes of either the same or of opposite type. We show in Fig. 6 how the aggregate error $E_S$ (diamonds) depends on the correlation coefficient r for a specific case with $(k_1, k_2) = (3, 12)$. Note that the mean degree z of these networks is fixed (z = 4.8), but the mean first-neighbor degree d decreases as r increases. Figure 6 illustrates that the highest error for the MF theory occurs when d is lowest. This is the fully assortative (r = 1) case in which low-degree nodes link only to other low-degree nodes, creating a low-k connected cluster in which MF theory is inaccurate. At the other extreme, the disassortative (r = -1) case has every low-degree node linked only to high-degree nodes, with a consequent reduction in the error of MF theory (and a high value of d). The trends of the errors in each degree class also support our argument: high-degree neighbors correspond to lower MF error.
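The dependence of d on r in this two-degree family can be worked out in closed form under one additional assumption. With $p_{k_1} = k_2/(k_1 + k_2)$ and $p_{k_2} = k_1/(k_1 + k_2)$, each degree class holds exactly half of all edge ends, so a Pearson coefficient r corresponds to an edge from either class reaching its own class with probability (1 + r)/2. That conditional probability is an assumption of this sketch (it is not stated in the text), and the function name is hypothetical.

```python
def two_degree_network_stats(k1, k2, r):
    """Mean degree z and mean first-neighbor degree d for a network with
    only degrees k1 and k2 and degree-degree correlation coefficient r.

    Assumes node fractions k2/(k1+k2) and k1/(k1+k2), so that each class
    holds half of the edge ends, and assumes that an edge leaving a node
    reaches a node of the same class with probability (1 + r) / 2.
    """
    p1 = k2 / (k1 + k2)
    p2 = k1 / (k1 + k2)
    z = p1 * k1 + p2 * k2
    same = (1.0 + r) / 2.0
    d = (p1 * (same * k1 + (1.0 - same) * k2)
         + p2 * (same * k2 + (1.0 - same) * k1))
    return z, d
```

For $(k_1, k_2) = (3, 12)$ this gives z = 4.8 for every r and d = 7.5 - 2.7r, so d falls from 10.2 at r = -1 to 4.8 at r = 1, consistent with the statement that d decreases as r increases at fixed z.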

In Fig. 7, we show further details about the degree-3 nodes. To examine the importance of assumption (ii), we consider (for each value of r) an ensemble of M = 25 realizations of the SIS process on identical copies of the synthetic network. At a fixed (large) time, we record the state (either infected or susceptible) of each node i. Taking the average over the M realizations defines $f_i$, the fraction of realizations in which node i is infected at the chosen time. We now consider the distribution of $f_i$ values over the set of nodes i that all share the same degree k, noting that assumption (ii) implies that these $f_i$ values should be identical for every node i in a given k-class. Consequently, we plot in Fig. 7, for degree-3 nodes, the mean and standard deviation of the $f_i$ values for the same networks as used in Fig. 6. Note that the mean value equals $I_{\text{numerical}}(k = 3)$, the average infected fraction of degree-3 nodes in simulations. This mean value deviates from the cMF theory prediction $i_3(\infty)$, giving the errors shown by the crosses in Fig. 6. We note that the standard deviation of the $f_i$ values does not depend strongly on r (and hence also does not depend strongly on d). Moreover, if all of the $f_i$ values for each r value were to be replaced by their mean value (so that the standard deviation for $f_i$ was zero, in accord with assumption (ii)), then the errors shown in Fig. 6 would be unaltered. In this sense, the violation of assumption (ii) is not the main source of the high-d effect on the accuracy of MF theory.

To test for the existence of dynamical correlations (which are neglected under MF theory assumption (iii)) in the numerical results, we calculate the fraction $\phi_{SI}$ of SI edges (i.e., edges linking a susceptible node and an infected node) in the networks in steady state. We then compare this to the fraction of SI edges that would be obtained if no dynamical correlations were present. This is given by treating as independent the probabilities for nodes at each end of an edge to be in states S and I [Eq. (9)], where $\rho_k$ is the fraction of infected degree-k nodes in the network [hence $(1 - \rho_{k'})$ is the fraction of degree-k' nodes that are susceptible]. Note that we calculate $\rho_k$ from the numerical simulations (we do not use the corresponding MF theory values $i_k(\infty)$), so that we are directly testing the validity of assumption (iii) on the numerical data. The relative error $E_{SI}$ of the MF assumption on the SI edge fraction can then be calculated in a similar fashion to Eq. (8) [Eq. (10)].

If assumption (iii) were true, there would be no dynamical correlation effects and the fraction of SI edges could be computed directly from the fraction of infected nodes using Eq. (9), giving $E_{SI} = 0$. However, Fig. 8 shows that the error $E_{SI}$ increases as the mean first-neighbor degree d decreases, which is similar to the trend of the error $E_S$ in Fig. 6. This evidence supports our claim that it is MF assumption (iii), the neglect of dynamical correlations, that plays a dominant role in determining the accuracy of MF theory on high-d networks.
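The comparison described above can be illustrated on a single snapshot of node states. The sketch below measures the fraction of SI edges directly and then estimates the same fraction under independence, using for each edge (u, v) the product form $\rho_{k_u}(1 - \rho_{k_v}) + \rho_{k_v}(1 - \rho_{k_u})$ with $\rho_k$ taken from the snapshot. This per-edge formula is an assumed reading of the independence estimate, not a transcription of the paper's equation, and the function name is hypothetical.

```python
from collections import defaultdict

def si_edge_fractions(edges, infected):
    """Return (measured SI-edge fraction, independence estimate).

    edges    : list of undirected edges (u, v)
    infected : dict mapping node -> True (infected) or False (susceptible)
    The independence estimate treats the states at the two ends of each
    edge as independent given the end-node degrees (assumed form).
    """
    degree = defaultdict(int)
    for u, v in edges:
        degree[u] += 1
        degree[v] += 1
    # rho[k]: fraction of degree-k nodes that are infected in this snapshot
    count = defaultdict(int)
    n_inf = defaultdict(int)
    for node, k in degree.items():
        count[k] += 1
        n_inf[k] += 1 if infected[node] else 0
    rho = {k: n_inf[k] / count[k] for k in count}
    measured = sum(infected[u] != infected[v] for u, v in edges) / len(edges)
    independent = sum(
        rho[degree[u]] * (1 - rho[degree[v]]) + rho[degree[v]] * (1 - rho[degree[u]])
        for u, v in edges
    ) / len(edges)
    return measured, independent
```

A relative error formed from these two numbers, by analogy with Eq. (8), is zero when the states at the two ends of an edge are uncorrelated. In practice the quantities would be averaged over many steady-state snapshots rather than computed from one.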

The second family of synthetic networks is composed of networks with negligible degree-degree correlations, and is generated using the configuration model [59]. In these networks, $d = \langle k^2 \rangle / z$, so the mean first-neighbor degree increases with the second moment of the degree distribution if the mean degree z is fixed. For example, one can construct networks with z = 5 with the degree probabilities

$$p_3 = \frac{15}{17}(1 - \alpha), \quad p_5 = \alpha, \quad p_{20} = \frac{2}{17}(1 - \alpha), \tag{11}$$

and $p_k = 0$ for all other k. Such a network has a fraction α of nodes with degree 5, and the remaining nodes have degrees 3 and 20. The mean first-neighbor degree for such networks is d = 11 - 6α, so it decreases linearly with α. By comparing numerical simulations for the SIS model with MF theory (not shown), we find that the error magnitude $|E_S|$ increases monotonically with α. It takes the value 0.07 at α = 0 (with d = 11) and the value 0.16 at α = 1 (with d = 5). Similar to the correlated synthetic networks of Fig. 6, the higher values of d thus correspond to lower values of the error.
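The stated values z = 5 and d = 11 - 6α can be checked numerically. The sketch below assumes that the nodes not of degree 5 are split between degrees 3 and 20 in the ratio 15:2, which is the split that keeps the mean degree at 5 for every α; the function name is hypothetical.

```python
def three_degree_stats(alpha):
    """Mean degree z and d = <k^2>/z for an uncorrelated network with a
    fraction alpha of degree-5 nodes and the remainder split between
    degrees 3 and 20 in the ratio 15:2 (assumed split that fixes z = 5)."""
    p = {3: 15.0 * (1.0 - alpha) / 17.0,
         5: alpha,
         20: 2.0 * (1.0 - alpha) / 17.0}
    z = sum(k * pk for k, pk in p.items())
    second_moment = sum(k * k * pk for k, pk in p.items())
    return z, second_moment / z
```

For example, α = 0 gives d = 11 and α = 1 gives d = 5, the two endpoints quoted in the text.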

The evidence from both families of synthetic networks suggests that the high-d effect that we have observed can be due only to its impact on dynamical correlations [i.e., assumption (iii)]. In real-world networks, d can presumably affect the validity of all three MF assumptions. Further work is required to understand which assumption(s) have the strongest impact on MF accuracy in such situations.



C. Mean-Field Accuracy for Other Dynamical 
Processes 

The argument that we have given above for the usefulness of d as a measure of MF accuracy is specific to SIS dynamics, where the quantity of interest is the (ensemble-averaged) infected fraction of nodes. We now consider error measures on the (z, d) plane for the other dynamics studied here, the Kuramoto and voter models. For the Kuramoto model, we define an error measure in terms of the $r_2$ order parameter from Eq. (2) as

$$E_K = \frac{r_{2,\text{theory}} - r_{2,\text{numerical}}}{r_{2,\text{theory}}}, \tag{12}$$

which we evaluate at the value of K for which $r_{2,\text{theory}} = 0.6$. Similarly, a measure of relative error for the voter model is given by

$$E_V = \frac{\log_{10}(P_{s,\text{theory}}) - \log_{10}(P_{s,\text{numerical}})}{\log_{10}(P_{s,\text{theory}})}, \tag{13}$$

where $P_{s,\text{theory}}$ and $P_{s,\text{numerical}}$ are the survival probabilities given by Eq. (6) and by numerical simulations, respectively. We evaluate these quantities at the time t that corresponds to a survival probability of one percent (in MF theory): $P_{s,\text{theory}}(t) = 10^{-2}$. This definition reflects the vertical difference between the dashed curve and the symbols in Fig. 3 at a specific value of t.
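The two error measures are simple to evaluate once the theoretical and numerical values are in hand. The sketch below assumes the forms $E_K = (r_{2,\text{theory}} - r_{2,\text{numerical}})/r_{2,\text{theory}}$ and $E_V = [\log_{10} P_{s,\text{theory}} - \log_{10} P_{s,\text{numerical}}]/\log_{10} P_{s,\text{theory}}$; the function names are hypothetical.

```python
import math

def kuramoto_error(r2_theory, r2_numerical):
    """Relative error in the Kuramoto order parameter (assumed form)."""
    return (r2_theory - r2_numerical) / r2_theory

def voter_error(ps_theory, ps_numerical):
    """Relative error in log10 of the survival probability (assumed form).
    Both probabilities must lie strictly between 0 and 1."""
    log_theory = math.log10(ps_theory)
    return (log_theory - math.log10(ps_numerical)) / log_theory
```

For instance, if theory predicts a survival probability of 0.01 at the chosen time while simulations give 0.001, the logarithmic measure is -0.5: on a logarithmic axis, the numerical curve lies below the theory curve by half of the theory's own distance from 1.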

We give the measured values for $E_V$ and $E_K$ for all real-world networks in Table I, and Figures 9(a) and 9(b) show how these values depend on the mean degree z and mean first-neighbor degree d of the networks. The Kuramoto model exhibits a similar pattern to SIS [compare Fig. 9(a) to Fig. 5]: high-d networks have lower errors than low-d networks. However, the high-d effect does not seem to impact the voter model [see Fig. 9(b)] in the same way. We can understand this by contrasting the predicted quantities for SIS and for the voter model. For SIS, the error is low when MF theory accurately predicts the fraction I(t) of infected nodes. In the voter model, the quantity corresponding to I(t) is the fraction of nodes in one of the two voter states, which we denote by $I_V(t)$. When initial states are randomly assigned, MF theory predicts that $I_V(t)$ is conserved, which implies that $I_V(t) = 1/2$ for all t. Numerical simulations on real-world networks also give $I_V(t) = 1/2$, but only when one averages over an ensemble of realizations. In any single realization on a network of finite size, fluctuations eventually lead to the entire network becoming ordered, i.e., all nodes eventually share the same opinion. (In half of the realizations, this shared opinion is one voter state; in the other half it is the other state.) It is this single-realization ordering process that is measured by the survival probability $P_s(t)$. In this respect, the voter model is different




FIG. 9: (Color online) As in Fig. 5, except that color indicates the magnitude of the relative error, which is determined by comparing numerical simulation results to (a) MF theory for the Kuramoto order parameter $r_2$ (with $r_{2,\text{theory}} = 0.6$), from Eq. (12); (b) MF theory for the voter-model survival probability $P_s(t)$ at the time t (with $P_{s,\text{theory}} = 10^{-2}$), from Eq. (13); and (c) PA theory for the voter-model survival probability.
from both the SIS and Kuramoto models, in which the trajectory of a typical realization is qualitatively similar to an ensemble average of all trajectories.

In order to better capture quantities of higher order than $I_V(t)$, such as $P_s(t)$, it is necessary to approximate the dynamical correlations between nodes using, for example, a pair-approximation method. In Fig. 9(c), we show the magnitude of the PA error for the voter model; we still measure the error using Eq. (13), but we use the PA theory of Eq. (7) instead of the MF theory of Eq. (6). Observe the improvement in accuracy over the MF theory of Fig. 9(b), particularly for high-d networks. Similar to MF for SIS (see Fig. 5) and for Kuramoto [see Fig. 9(a)], only networks with both low z and low d have relative errors that significantly exceed 5%.



IV. COMPARISON WITH AN ALTERNATIVE 
MEASURE 

Using numerous examples of real-world and synthetic
networks, we have illustrated that the mean degree of
first-neighbors d is a good indicator of the accuracy of
MF theories for a variety of dynamical processes on networks.
One can also construct more complicated accuracy
measures, which may in general depend on the
dynamics under scrutiny. In Ref. [52], for example, we
examined (among other dynamics) the accuracy of the
bond percolation theory of [60] by comparing its predictions
with numerical calculations of the sizes of the
largest connected component for several real-world networks.
We showed that a good measure of the error is
given by the quantity

    q = (ℓ − ℓ_1)/z,    (14)

where ℓ is the mean inter-vertex distance in the original
(clustered) network, ℓ_1 is the corresponding mean
distance in a rewired version of the network (using a
rewiring process that preserves degree-degree correlations
but reduces clustering), and z is the mean degree. Although
the bond percolation theory of Ref. [60] is of
pair-approximation (PA) type, in contrast to the mean-field
theories on which we focus in this paper, it is nevertheless
of interest to examine the relation between q and
the mean first-neighbor degree d that we have identified
in this paper as an indicator of MF theory accuracy for
several dynamical processes.
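
As a rough illustration of how such a measure could be evaluated, the sketch below computes mean inter-vertex distances by breadth-first search and combines them as q = (ℓ − ℓ_1)/z, which is the form assumed here for the error measure. The rewiring procedure itself is not implemented: the "rewired" network is a hypothetical stand-in (a triangle-free 3-regular graph with the same degree sequence), and all function names are illustrative.

```python
from collections import deque

def from_edges(edges):
    """Build an undirected adjacency dict from a list of (u, v) pairs."""
    neighbors = {}
    for u, v in edges:
        neighbors.setdefault(u, []).append(v)
        neighbors.setdefault(v, []).append(u)
    return neighbors

def mean_distance(neighbors):
    """Mean shortest-path length over ordered pairs of distinct, mutually reachable nodes."""
    total, pairs = 0, 0
    for source in neighbors:
        dist = {source: 0}
        queue = deque([source])
        while queue:                      # breadth-first search from source
            v = queue.popleft()
            for u in neighbors[v]:
                if u not in dist:
                    dist[u] = dist[v] + 1
                    queue.append(u)
        total += sum(dist.values())
        pairs += len(dist) - 1
    return total / pairs

def q_measure(original, rewired):
    """Assumed form q = (l - l_1) / z, with z the mean degree of the original network."""
    z = sum(len(nbrs) for nbrs in original.values()) / len(original)
    return (mean_distance(original) - mean_distance(rewired)) / z

# Hypothetical clustered 3-regular network: two 4-node groups with triangles, joined by two links.
clustered = from_edges([(0, 1), (0, 2), (0, 3), (1, 2), (1, 3),
                        (4, 5), (4, 6), (4, 7), (5, 6), (5, 7),
                        (2, 6), (3, 7)])
# Hypothetical stand-in for the rewired network: the cube graph (3-regular, no triangles).
cube = from_edges([(0, 1), (0, 2), (0, 4), (1, 3), (1, 5), (2, 3),
                   (2, 6), (3, 7), (4, 5), (4, 6), (5, 7), (6, 7)])

print("q =", q_measure(clustered, cube))
```

In this toy case the clustered network has the larger mean distance, so q is positive. For real data, the second argument would be a network produced by a rewiring that preserves degree-degree correlations.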

In the (q, d) parameter plane of Fig. 10, we show the
positions of those real-world networks from Fig. 5 which
were also examined in Ref. [52]. The expected relationship
between q and d is revealed: Networks with high d
have low q (and hence, according to Ref. [52], have low
error for bond percolation PA theory), while low d values
correspond to high q and hence to large errors. Thus,
despite its simplicity, d performs well when compared to
other more involved measures. As noted above, there is
scope for further work on developing more complicated
diagnostics (such as q) for predicting MF accuracy for
various dynamics, but the use of the d value is appealing
because it is simple to calculate and aids in understanding
the underlying causes of MF inaccuracies.

FIG. 10: (Color online) Location of real-world networks in
the (q, d) parameter plane, where q is the error measure (14),
which was shown in Ref. [52] to be correlated with the error
of PA theory for bond percolation. We use the same symbols
as in Fig. 5.
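
To indicate how simple the calculation is, the following sketch computes the mean degree z and a mean first-neighbor degree d from an adjacency list. It assumes that d is the average, over all nodes, of the mean degree of each node's neighbors; the paper defines d earlier, and this form is an assumption here. The function name and the star-graph example are hypothetical.

```python
def mean_degree_and_neighbor_degree(neighbors):
    """Return (z, d) for an undirected network given as {node: [neighbors]}.

    Assumes every node has at least one neighbor, as in a connected
    component with more than one node.
    """
    degree = {v: len(nbrs) for v, nbrs in neighbors.items()}
    n = len(degree)
    # z: mean degree over all nodes.
    z = sum(degree.values()) / n
    # d: for each node, average the degrees of its neighbors; then average over nodes.
    d = sum(sum(degree[u] for u in nbrs) / len(nbrs)
            for nbrs in neighbors.values()) / n
    return z, d

# Hypothetical example: a star with one hub (node 0) and three leaves.
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
z, d = mean_degree_and_neighbor_degree(star)
print("z =", z, "d =", d)
```

The star has low z but comparatively high d because each low-degree leaf is attached to the hub, which is the situation in which the text expects MF theory to perform well.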

V. CONCLUSIONS 

In summary, we have shown that MF theory works best
for networks in which low-degree nodes (if present) are
connected to high-degree nodes (i.e., for networks that
are either disassortative by degree or have high mean degree
z). Remarkably, it is not necessary that the mean degree
of the network be large for MF theory to work well,
at least for ensemble-averaged quantities [see Figs. 5 and
9(a)]. In addition to the 21 real-world networks that we
have studied, we have presented evidence from synthetic
networks to support our hypothesis. We stress that our
error measures focus on behavior far from critical points;
the accuracy of MF for the calculation of phase transition
points (such as the value of K for the onset of
synchronization in the Kuramoto model [61, 62] or the
SIS epidemic threshold [63-65]) is a topic for future
work.

Based on our results for the voter model in Sec. III, we
expect that similar conclusions should hold for the applicability
of pair-approximation theories
for dynamics on real-world networks. Although
PA theories account for the dynamical correlations
that plague MF theories, they remain vulnerable
to the effects of network clustering and modularity when
d is low. For example, Fig. 3(b) shows a case in
which PA theory works well only on the rewired (and
hence unclustered) version of a (low-d) network. This
suggests that PA theory (like MF theory) is most accurate
for real-world networks with either high mean degree
or high mean first-neighbor degree.

No. Network                    N      z       d       k<=10  C(3.6)  C(3.4)  r        E_S     E_K   E_V   E_V^PA  Ref.
1   Word adjacency: Spanish    11558  7.45    942.58  0.91   0.38    0.02    -0.2819  -0.03   0.01  0.21  0.04    [30]
2   Word adjacency: English    7377   11.98   666.03  0.81   0.41    0.04    -0.2366  -0.03   0.01  0.12  0.03    [30]
3   AS Internet                28311  4.00    473.65  0.96   0.21    0.01    -0.2000  0.08    0.03  0.44  0.08    [29]
4   Word adjacency: French     8308   5.74    376.22  0.93   0.21    0.01    -0.2330  -0.02   0.03  0.23  0.003   [30]
5   Marvel comics              6449   52.17   338.16  0.25   0.78    0.19    -0.1647  -0.03   0.02  0.03  0.008   [66]
6   Reuters 9/11 news          13308  22.25   236.17  0.70   0.37    0.11    -0.1090  -0.01   0.01  0.04  -0.009  [67]
7   Word adjacency: Japanese   2698   5.93    199.99  0.92   0.22    0.03    -0.2590  -0.02   0.02  0.24  0.01    [30]
8   Facebook Oklahoma          17420  102.47  186.04  0.11   0.23    0.16    0.0737   0.03    0.01  0.02  0.01    [19, 20]
9   Corporate ownership (EVA)  4475   2.08    113.85  0.98   0.01    0.00    -0.1851  0.80    0.06  1.00  0.65    [68]
10  Political blogs            1222   27.36   100.07  0.45   0.32    0.23    -0.2213  0.04    0.01  0.05  0.009   [69]
11  Facebook Caltech           762    43.70   74.65   0.20   0.41    0.29    -0.0662  0.004   0.01  0.04  0.009   [19, 20]
12  Airports500                500    11.92   59.50   0.76   0.62    0.35    -0.2679  -0.002  0.06  0.29  0.19    [37, 38]
13  C. Elegans Metabolic       453    8.94    51.57   0.86   0.65    0.12    -0.2258  0.03    0.04  0.18  0.04    [39, 40]
14  Interacting Proteins       4713   6.30    32.92   0.84   0.09    0.06    -0.1360  0.14    0.04  0.19  -0.03   [41-43]
15  C. Elegans Neural          297    14.46   32.00   0.41   0.29    0.18    -0.1632  0.10    0.03  0.09  0.006   [21, 44]
16  Transcription yeast        662    3.21    22.31   0.95   0.05    0.02    -0.4098  0.64    0.08  0.70  0.36    [70]
17  Coauthorships              39577  8.88    20.17   0.77   0.65    0.25    0.1863   0.22    0.08  0.14  0.01    [45, 46]
18  Transcription E. coli      328    2.78    17.88   0.96   0.11    0.02    -0.2648  0.48    0.09  0.79  0.43    [71]
19  PGP Network                10680  4.55    13.46   0.90   0.27    0.38    0.2382   0.50    0.16  0.45  0.17    [47-49]
20  Electronic Circuit (s838)  512    3.20    4.01    1.00   0.05    0.05    -0.0300  0.78    0.23  0.62  0.23    [30]
21  Power Grid                 4941   2.67    3.97    0.99   0.08    0.10    0.0035   0.93    0.29  0.90  0.65    [21, 50]
22  γ-theory [γ(3,3) = 1]      1002   3       3       1.00   1/3     1/3     N/A      0.91    0.89  0.78  0.47    [10]
23  γ-theory [γ(3,3) = 1]      10002  3       3       1.00   1/3     1/3     N/A      0.97    0.88  0.79  0.49

TABLE I: Basic diagnostics and error measures for the networks used in this paper. All real-world data have been treated in the form of undirected, unweighted
networks. We only consider the largest connected component of each network, for which we enumerate the following properties: total number of nodes N; mean degree
z; mean first-neighbor degree d; fraction of nodes of degree 10 or less; two clustering coefficients (defined, respectively, in Eqs. (3.6) and (3.4) of Ref. [51]); and
Pearson degree correlation coefficient r. The quantities E_S, E_K, and E_V are the relative MF errors for the SIS, Kuramoto, and voter models; E_V^PA is the relative PA
error for the voter model. The last column gives the citation for the network in this paper's bibliography and/or the URL of a data file.
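
Two of the tabulated diagnostics can be sketched in a few lines. The code below assumes the standard edge-based Pearson correlation of the degrees at the two ends of each link for r, and returns None when that correlation is undefined (as for a regular network, where the table lists N/A). The function names and example graphs are hypothetical and are not taken from the paper.

```python
def degree_correlation(neighbors):
    """Pearson correlation of the degrees at either end of an edge, or None if undefined."""
    degree = {v: len(nbrs) for v, nbrs in neighbors.items()}
    # Each undirected edge contributes both orderings, (k_v, k_u) and (k_u, k_v).
    ends = [(degree[v], degree[u]) for v, nbrs in neighbors.items() for u in nbrs]
    mean = sum(k for k, _ in ends) / len(ends)
    var = sum((k - mean) ** 2 for k, _ in ends) / len(ends)
    if var == 0:
        return None  # all edge ends have the same degree (e.g., a regular graph)
    cov = sum((k - mean) * (j - mean) for k, j in ends) / len(ends)
    return cov / var

def low_degree_fraction(neighbors, k_max=10):
    """Fraction of nodes whose degree is k_max or less."""
    return sum(1 for nbrs in neighbors.values() if len(nbrs) <= k_max) / len(neighbors)

# Hypothetical examples: a star (maximally disassortative) and a triangle (regular).
star = {0: [1, 2, 3], 1: [0], 2: [0], 3: [0]}
triangle = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

print("r for the star:", degree_correlation(star))
print("r for the triangle:", degree_correlation(triangle))
print("fraction of star nodes with degree <= 1:", low_degree_fraction(star, k_max=1))
```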



